-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: releasing CLI to PyPI #9452
Conversation
Hi @alamb Can you please give this a quick look? |
Thanks @MohamedAbdeen21. I will review this in the next day or two. |
@MohamedAbdeen21 My first question on this is why would we not do this from the |
Hey @andygrove this is not an issue of Python support, this is making DF more accessible to users through a more popular package manager i.e. pip, and it just happens that the easiest way to do so is through maturin. Can we do it without maturin and the whole python bindings thing? probably, but that will be harder to maintain Can we move this to arrow-datafusion-python? maybe, but I don't see what difference that makes given that there is no rust/python code involved in the deployment process. |
Maybe once we require to manually bind some parts of the CLI before releasing we will have to move it to the python repo, but as for now I see no point in doing so |
My point is that DataFusion is a Rust project and datafusion-cli is already available via cargo, which is the default packaging manager for Rust. If we want to use Python packaging for DataFusion, it seems logical to do that in the DataFusion Python repository. |
Logically it makes sense, but my point is that it doesn't have to be; as long as it can be deployed from the source why go through an extra layer of indirection? I'm looking at pip as just another package manager, not specifically Python's package manager. Feel free to close this PR and re-open it in the Python repo, I'm happy to have DF on pip regardless of where it's deployed from. I'll also remove the test from PyPI once the PR is closed. |
Removed the release from PyPI and closing this issue, to be re-opened in the Python repo by the maintainers. |
Which issue does this PR close?
Closes #9294.
Rationale for this change
Documentation states that datafusion-cli can be installed through pip; however, that is not the case. This adds a workflow to deploy datafusion-cli (and only the CLI, not the python module) to PyPI , allowing installation using pip.
What changes are included in this PR?
This is meant to be a minimal CI, to be reviewed and edited as necessary (handle licensing, tag names, maybe maintainers prefer manual releases, etc)
CC @andygrove as he's involved in arrow-datafusion-python workflows and can have some comments.
A final note: builds for 32-bit systems fail because of
src/main.rs:304
which expects1 << 40
to be usize. This causes cargo to panic and abort builds.Are these changes tested?
You can try testing the deployment yourself using
pip install datafusion-cli
. I thought I changed the name and the repo to reference my fork, but apparently not.I'll remove it from PyPI once the PR is ready to merge, to be redeployed by the maintainers.
Are there any user-facing changes?
Should be able to use pip to install datafusion-cli